Manifold learning on handwritten digits: Locally Linear Embedding, Isomap…


Embedding techniques comparison

Below, we compare different techniques. However, there are a couple of things to note:

the RandomTreesEmbedding is not technically a manifold embedding method, as it learns a high-dimensional representation to which we then apply a dimensionality reduction method. However, it is often useful to cast a dataset into a representation in which the classes are linearly separable.

the LinearDiscriminantAnalysis and the NeighborhoodComponentsAnalysis are supervised dimensionality reduction methods, i.e. they make use of the provided labels, contrary to the other methods.

the TSNE is initialized with the embedding that is generated by PCA in this example. This ensures global stability of the embedding, i.e., the embedding does not depend on a random initialization.
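As a minimal sketch of that last point, t-SNE can be constructed with an explicit init="pca" (here applied to a small random array standing in for the digits data, which is an assumption of this sketch, not part of the example):

```python
import numpy as np
from sklearn.manifold import TSNE

# Small random dataset as a stand-in for the digits data (assumption:
# any (n_samples, n_features) array behaves the same way here).
rng = np.random.RandomState(0)
X_demo = rng.rand(100, 10)

# init="pca" seeds the embedding with the leading principal components,
# so the result does not depend on a random starting layout.
tsne = TSNE(n_components=2, init="pca", random_state=0, perplexity=30)
X_2d = tsne.fit_transform(X_demo)
print(X_2d.shape)  # (100, 2)
```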

from sklearn.decomposition import TruncatedSVD
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomTreesEmbedding
from sklearn.manifold import (
    Isomap,
    LocallyLinearEmbedding,
    MDS,
    SpectralEmbedding,
    TSNE,
)
from sklearn.neighbors import NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.random_projection import SparseRandomProjection

embeddings = {
    "Random projection embedding": SparseRandomProjection(
        n_components=2, random_state=42
    ),
    "Truncated SVD embedding": TruncatedSVD(n_components=2),
    "Linear Discriminant Analysis embedding": LinearDiscriminantAnalysis(
        n_components=2
    ),
    "Isomap embedding": Isomap(n_neighbors=n_neighbors, n_components=2),
    "Standard LLE embedding": LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=2, method="standard"
    ),
    "Modified LLE embedding": LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=2, method="modified"
    ),
    "Hessian LLE embedding": LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=2, method="hessian"
    ),
    "LTSA LLE embedding": LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=2, method="ltsa"
    ),
    "MDS embedding": MDS(
        n_components=2, n_init=1, max_iter=120, n_jobs=2, normalized_stress="auto"
    ),
    "Random Trees embedding": make_pipeline(
        RandomTreesEmbedding(n_estimators=200, max_depth=5, random_state=0),
        TruncatedSVD(n_components=2),
    ),
    "Spectral embedding": SpectralEmbedding(
        n_components=2, random_state=0, eigen_solver="arpack"
    ),
    "t-SNE embedding": TSNE(
        n_components=2,
        n_iter=500,
        n_iter_without_progress=150,
        n_jobs=2,
        random_state=0,
    ),
    "NCA embedding": NeighborhoodComponentsAnalysis(
        n_components=2, init="pca", random_state=0
    ),
}

Once we have declared all the methods of interest, we can run and perform the projection of the original data. We store the projected data as well as the computational time needed to perform each projection.

from time import time

projections, timing = {}, {}
for name, transformer in embeddings.items():
    if name.startswith("Linear Discriminant Analysis"):
        data = X.copy()
        data.flat[:: X.shape[1] + 1] += 0.01  # Make X invertible
    else:
        data = X

    print(f"Computing {name}...")
    start_time = time()
    projections[name] = transformer.fit_transform(data, y)
    timing[name] = time() - start_time

Computing Random projection embedding...
Computing Truncated SVD embedding...
Computing Linear Discriminant Analysis embedding...
Computing Isomap embedding...
Computing Standard LLE embedding...
Computing Modified LLE embedding...
Computing Hessian LLE embedding...
Computing LTSA LLE embedding...
Computing MDS embedding...
Computing Random Trees embedding...
Computing Spectral embedding...
Computing t-SNE embedding...
Computing NCA embedding...

Finally, we can plot the resulting projection given by each method.

for name in timing:
    title = f"{name} (time {timing[name]:.3f}s)"
    plot_embedding(projections[name], title)

plt.show()
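The plot_embedding helper is defined earlier in the full example and is not shown in this excerpt. A simplified, hypothetical stand-in (not the gallery's exact implementation) that scatters a 2D projection rescaled to the unit square could look like:

```python
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler


def plot_embedding(X_2d, title, y=None):
    """Scatter a 2D projection, rescaled to [0, 1] on both axes.

    Hypothetical simplified version of the helper from the full example;
    `y` (the digit labels) is optional here and only controls coloring.
    """
    X_2d = MinMaxScaler().fit_transform(X_2d)
    fig, ax = plt.subplots()
    ax.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap="tab10", s=10)
    ax.set_title(title)
    ax.set_xticks([])
    ax.set_yticks([])
    return ax
```

The real helper additionally annotates the scatter with digit glyphs and image thumbnails; the sketch above keeps only the rescaling-and-scatter core.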

Total running time of the script: (0 minutes 16.705 seconds)

Download Python source code: plot_lle_digits.py

Download Jupyter notebook: plot_lle_digits.ipynb

Gallery generated by Sphinx-Gallery


